Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

نویسندگان

Benjamin Recht

Christopher Ré

Stephen J. Wright

Feng Niu

چکیده

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve stateof-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performancedestroying memory locking and synchronization. This work aims to show using novel theoretical analysis, algorithms, and implementation that SGD can be implemented without any locking. We present an update scheme called HOGWILD! which allows processors access to shared memory with the possibility of overwriting each other’s work. We show that when the associated optimization problem is sparse, meaning most gradient updates only modify small parts of the decision variable, then HOGWILD! achieves a nearly optimal rate of convergence. We demonstrate experimentally that HOGWILD! outperforms alternative schemes that use locking by an order of magnitude.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee

Stochastic gradient descent (SGD) and its variants have become more and more popular in machine learning due to their efficiency and effectiveness. To handle large-scale problems, researchers have recently proposed several parallel SGD methods for multicore systems. However, existing parallel SGD methods cannot achieve satisfactory performance in real applications. In this paper, we propose a f...

متن کامل

Lock-Free Optimization for Non-Convex Problems

Stochastic gradient descent (SGD) and its variants have attracted much attention in machine learning due to their efficiency and effectiveness for optimization. To handle largescale problems, researchers have recently proposed several lock-free strategy based parallel SGD (LF-PSGD) methods for multi-core systems. However, existing works have only proved the convergence of these LF-PSGD methods ...

متن کامل

Factorbird - a Parameter Server Approach to Distributed Matrix Factorization

We present ‘Factorbird’, a prototype of a parameter server approach for factorizing large matrices with Stochastic Gradient Descent-based algorithms. We designed Factorbird to meet the following desiderata: (a) scalability to tall and wide matrices with dozens of billions of non-zeros, (b) extensibility to different kinds of models and loss functions as long as they can be optimized using Stoch...

متن کامل

Scalable Asynchronous Gradient Descent Optimization for Out-of-Core Models

Existing data analytics systems have approached predictive model training exclusively from a data-parallel perspective. Data examples are partitioned to multiple workers and training is executed concurrently over different partitions, under various synchronization policies that emphasize speedup or convergence. Since models with millions and even billions of features become increasingly common ...

متن کامل

Parallel Stochastic Gradient Descent with Sound Combiners

Stochastic gradient descent (SGD) is a well-known method for regression and classification tasks. However, it is an inherently sequential algorithm — at each step, the processing of the current example depends on the parameters learned from the previous examples. Prior approaches to parallelizing SGD, such as HOGWILD! and ALLREDUCE, do not honor these dependences across threads and thus can pot...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2011

Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

نویسندگان

چکیده

منابع مشابه

Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee

Lock-Free Optimization for Non-Convex Problems

Factorbird - a Parameter Server Approach to Distributed Matrix Factorization

Scalable Asynchronous Gradient Descent Optimization for Out-of-Core Models

Parallel Stochastic Gradient Descent with Sound Combiners

عنوان ژورنال:

اشتراک گذاری